Abstract

Often of primary interest in the analysis of multivariate data are the copula parameters
describing the dependence among the variables, rather than the univariate marginal
distributions. Since the ranks of a multivariate dataset are invariant to changes in the
univariate marginal distributions, rank-based procedures are natural candidates as
semiparametric estimators of copula parameters. Asymptotic information bounds for such
estimators can be obtained from an asymptotic analysis of the rank likelihood, i.e. the
probability of the multivariate ranks. In this article, we obtain limiting normal
distributions of the rank likelihood for Gaussian copula models. Our results cover models
with structured correlation matrices, such as exchangeable, autoregressive and circular
correlation, as well as unstructured correlation matrices. For all Gaussian copula models,
the limiting distribution of the rank likelihood ratio is shown to be equal to that of a
parametric likelihood ratio for an appropriately chosen multivariate normal model. This
implies that the semiparametric information bounds for rank-based estimators are the same
as the information bounds for estimators based on the full data, and that the multivariate
normal distributions are least favorable.

1 Rank likelihood for copula models

Recall that a copula is a multivariate CDF having uniform univariate marginal distributions.
For any multivariate CDF $F(y_1, \ldots, y_p)$ with continuous margins $F_1, \ldots, F_p$, the
corresponding copula $C(u_1, \ldots, u_p)$ is given by

\[ C(u_1, \ldots, u_p) = F(F_1^{-1}(u_1), \ldots, F_p^{-1}(u_p)). \]

Sklar's theorem (Sklar, 1959) shows that $C$ is the unique copula for which
$F(y_1, \ldots, y_p) = C(F_1(y_1), \ldots, F_p(y_p))$.

In this article we consider models consisting of multivariate probability distributions for
which the copula is parameterized separately from the univariate marginal distributions.
Specifically, the models we consider consist of collections of multivariate CDFs
$\{F(y|\theta,\psi) : y \in \mathbb{R}^p, (\theta,\psi) \in \Theta \times \Psi\}$ such that $\psi$
parameterizes the univariate marginal distributions and $\theta$ parameterizes the copula,
meaning that for a random vector $Y = (Y_1, \ldots, Y_p)^T$ with CDF $F(y|\theta,\psi)$,

\[ \Pr(Y_j \le y_j | \theta, \psi) = F_j(y_j|\psi) \quad \forall\, \theta \in \Theta, \; j = 1, \ldots, p, \]

\[ \Pr(F_1(Y_1|\psi) \le u_1, \ldots, F_p(Y_p|\psi) \le u_p | \theta, \psi) = C(u_1, \ldots, u_p|\theta) \quad \forall\, \psi \in \Psi. \]

We refer to such a class of distributions as a copula-parameterized model. For such a model,
it will be convenient to refer to the class of copulas $\{C(u|\theta) : \theta \in \Theta\}$ as the
copula model, and the class $\{F_1(y|\psi), \ldots, F_p(y|\psi) : \psi \in \Psi\}$ as the marginal model.

As an example, the copula model for the class of p-variate multivariate normal distributions
is called the Gaussian copula model, and is parameterized by letting $\Theta$ be the set of
$p \times p$ correlation matrices. The marginal model for the p-variate normal distributions is
the set of all p-tuples of univariate normal distributions. The copula-parameterized models
we focus on in this article are semiparametric Gaussian copula models (Klaassen and Wellner,
1997), for which the copula model is Gaussian and the marginal model consists of the set of
all p-tuples of continuous univariate CDFs.

Let $Y$ be an $n \times p$ random matrix whose rows $Y_1, \ldots, Y_n$ are i.i.d. samples from a
p-variate population. We define the multivariate rank function
$R(Y) : \mathbb{R}^{n \times p} \to \mathbb{R}^{n \times p}$ so that $R_{i,j}$, the $(i,j)$th element
of $R(Y)$, is the rank of $Y_{i,j}$ among $\{Y_{1,j}, \ldots, Y_{n,j}\}$. Note that the ranks $R(Y)$
are invariant to strictly increasing transformations of the columns of $Y$, and therefore the
probability distribution of $R(Y)$ does not depend on the univariate marginal distributions
of the $p$ variables. As a result, for any copula parameterized model and data matrix
$y \in \mathbb{R}^{n \times p}$ with ranks $R(y) = r$, the likelihood $L(\theta, \psi : y)$ can be
decomposed as

\[ \begin{aligned}
L(\theta, \psi : y) = p(y|\theta,\psi) &= \Pr(R(Y) = r|\theta,\psi) \times p(y|\theta,\psi,r) \\
&= L(\theta : r) \times L(\theta, \psi : [y|r]),
\end{aligned} \tag{1} \]

where $p(y|\theta,\psi)$ is the joint density of $Y$ and $p(y|\theta,\psi,r)$ is the conditional
density of $Y$ given $R(Y) = r$. The function $L(\theta : r) = \Pr(R(Y) = r|\theta)$ is called the
rank likelihood function. In situations where $\theta$ is the parameter of interest and $\psi$ a
nuisance parameter, the rank likelihood function can be used to obtain estimates of $\theta$
without having to estimate the margins or specify a marginal model. A univariate rank
likelihood function was proposed by Pettitt (1982) for estimation in monotonically
transformed regression models. Asymptotic properties of the rank likelihood for this
regression model were studied by Bickel and Ritov (1997), and a parameter estimation scheme
based on Gibbs sampling was provided in Hoff (2008). Rank likelihood estimation of copula
parameters was studied in Hoff (2007), who also extended the rank likelihood to accommodate
multivariate data with mixed continuous and discrete marginal distributions.
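The invariance of the ranks to strictly increasing transformations of the columns can be checked numerically. The following is a minimal sketch, not part of the original analysis: the helper `column_ranks`, the simulated data and the two transformations are illustrative assumptions, and the data are taken to be continuous so that ties have probability zero.

```python
import math
import random

def column_ranks(y):
    """Return R(y): entry (i, j) is the rank of y[i][j] within column j (1 = smallest)."""
    n, p = len(y), len(y[0])
    ranks = [[0] * p for _ in range(n)]
    for j in range(p):
        order = sorted(range(n), key=lambda i: y[i][j])
        for rank, i in enumerate(order, start=1):
            ranks[i][j] = rank
    return ranks

random.seed(1)
# Hypothetical data: n = 6 rows, p = 2 continuous columns.
y = [[random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)] for _ in range(6)]

# Apply a different strictly increasing transformation to each column.
y_transformed = [[math.exp(row[0]), row[1] ** 3 + 5.0] for row in y]

ranks_original = column_ranks(y)
ranks_transformed = column_ranks(y_transformed)
print(ranks_original == ranks_transformed)
```

Because the ranks agree, any statistic computed from $R(Y)$ alone, including the rank likelihood, takes the same value on the original and the transformed data.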

The rank likelihood is constructed from the marginal probability of the ranks and can
therefore be viewed as a type of marginal likelihood. Marginal likelihood procedures are
often used for estimation in the presence of nuisance parameters (see Section 8.3 of
Severini (2000) for a review). Ideally, the statistic that generates a marginal likelihood is
"partially sufficient" in the sense that it contains all of the information about the
parameter of interest that can be quantified without specifying the nuisance parameter.
Notions of partial sufficiency include G-sufficiency (Barnard, 1963) and L-sufficiency
(Remon, 1984), which are motivated by group invariance and profile likelihood, respectively.
Hoff (2007) showed that the ranks $R(Y)$ are both a G- and L-sufficient statistic in the
context of copula estimation.

Although rank-based estimators of the copula parameter $\theta$ may be appealing for the
reasons described above, one may wonder to what extent they are efficient. The decomposition
given in (1) indicates that rank-based estimates do not use any information about $\theta$
contained in $L(\theta, \psi : [y|r])$, i.e. the conditional probability of the data given the
ranks. For at least one copula model, this information is asymptotically negligible: Klaassen
and Wellner (1997) showed that for the bivariate normal copula model, a rank-based estimator
is semiparametrically efficient and has asymptotic variance equal to the Cramér-Rao
information bound in the bivariate normal model, i.e. the bivariate normal model is the least
favorable submodel. Genest and Werker (2002) studied the efficiency properties of
pseudo-likelihood estimators for two-dimensional semiparametric copula models and showed that
the pseudo-likelihood estimators (which are functions of the bivariate ranks) are not in
general semiparametrically efficient for non-Gaussian copulas. Chen et al. (2006) proposed
estimators in general multivariate copula models that achieve semiparametric asymptotic
efficiency but are not based solely on the multivariate ranks. It remains unclear whether
estimators based solely on the ranks can be asymptotically efficient in general
semiparametric copula models. In particular, it is not yet known if maximum likelihood
estimators based on rank likelihoods for Gaussian semiparametric copula models are
semiparametrically efficient.

The potential efficiency loss of rank-based estimators can be investigated via the limiting
distribution of an appropriately scaled rank likelihood ratio. Generally speaking, the local
asymptotic normality (LAN) of a likelihood ratio plays an important role in the asymptotic
analysis of testing and estimation procedures. For semiparametric models, the asymptotic
variance of a LAN likelihood ratio can be related to efficient tests (Choi et al., 1996) and
information bounds for regular estimators (Begun et al., 1983; Bickel et al., 1993). In
particular, the variance of the limiting normal distribution of a LAN rank likelihood ratio
provides information bounds for locally regular rank-based estimators of copula parameters.

In this article we obtain the limiting normal distributions of the rank likelihood ratio
for Gaussian copula models with structured and unstructured correlation matrices. In the
next section we develop several lemmas that give sufficient conditions under which the rank
likelihood is LAN. The basic result is that the rank likelihood is LAN if there exists a good
rank-measurable approximation to a LAN submodel. For Gaussian copulas, the natural candidate
submodels are multivariate normal models, for which the likelihood is quadratic in the
observations. In Section 3, we prove a theorem that identifies the types of normal quadratic
forms that have good rank-measurable approximations. This result allows us to identify
multivariate normal submodels with likelihood ratios that asymptotically approximate the
rank likelihood ratio. In Section 4, this result is applied to a class of Gaussian copula
models for which the rows of the correlation matrices are permutations of each other. This
class includes exchangeable, block-exchangeable and circular correlation models, among
others. For a model in this class, the limiting distribution of the rank likelihood ratio is
shown to be the same as that of the parametric likelihood ratio for the corresponding
multivariate normal model having equal marginal variances. More generally, in Section 5 we
show that for any smoothly parameterized Gaussian copula, the rank likelihood ratio is LAN
with an asymptotic variance equal to that of the likelihood ratio for the corresponding
multivariate normal model with unequal marginal variances. Since the parametric multivariate
normal model is a submodel of the semiparametric Gaussian copula model, and in general the
semiparametric information bound based on the full data is higher than that of any parametric
submodel, our results imply that the bounds for rank-based estimators are equal to the
semiparametric bounds for estimators based on the full data, and that the multivariate normal
models are least favorable. This is discussed further in Section 6.



2 Approximating the rank likelihood ratio 

The local log rank likelihood ratio is defined as

\[ \lambda_r(s) = \log \frac{L(\theta + s/\sqrt{n} : r)}{L(\theta : r)}, \]

where $L(\theta : r)$ is defined in (1). Studying $\lambda_r$ is difficult because $L(\theta : r)$ is
the integral of a copula density over a complicated set defined by multivariate order
constraints. However, in some cases it is possible to obtain the asymptotic distribution of
$\lambda_r$ by relating it to the local log likelihood ratio $\lambda_y$ of an appropriate parametric
multivariate model. In this section, we will show that if we can find a sufficiently good
rank-measurable approximation $\lambda_{\hat y}$ of $\lambda_y$, then the limiting distribution of
$\lambda_r$ will match that of $\lambda_y$. This is analogous to the approach taken by Bickel and
Ritov (1997) in their investigation of the rank likelihood ratio for a univariate
semiparametric regression model.

Define the local log likelihood ratio of a copula parameterized model as

\[ \lambda_y(s,t) = \log \frac{L(\theta + s/\sqrt{n}, \psi + t/\sqrt{n} : y)}{L(\theta, \psi : y)}, \tag{2} \]

where $L(\theta, \psi : y)$ is the (parametric) likelihood function for a given dataset
$y \in \mathbb{R}^{n \times p}$. The lack of dependence of the rank likelihood on the marginal
distributions leads to the following identity relating $\lambda_r(s)$ to $\lambda_y(s,t)$:

Lemma 2.1. Let $Y$ be a random data matrix with rows $Y_1, \ldots, Y_n \sim$ i.i.d.
$F(y|\theta,\psi)$. For every value of $\psi$, $s$ and $t$ such that
$\theta, \theta + s/\sqrt{n} \in \Theta$ and $\psi, \psi + t/\sqrt{n} \in \Psi$,

\[ \lambda_r(s) = \log E[e^{\lambda_y(s,t)} | R(Y) = r]. \]

Proof.

\[ \begin{aligned}
\log E[e^{\lambda_y(s,t)} | R(Y) = r]
&= \log \int_{R(y) = r} \frac{p(y|\theta + s/\sqrt{n}, \psi + t/\sqrt{n})}{p(y|\theta,\psi)} \,
   \frac{p(y|\theta,\psi)}{\Pr(R(Y) = r|\theta,\psi)} \, dy \\
&= \log \frac{\Pr(R(Y) = r|\theta + s/\sqrt{n})}{\Pr(R(Y) = r|\theta)} = \lambda_r(s).
\end{aligned} \]

□

Note that the identity holds for any marginal model and for any value of $\psi$ and $t$ for
which $\psi$ and $\psi + t/\sqrt{n}$ are in the parameter space. Therefore, the numerical value of
$\lambda_r$ is invariant to the marginal distribution under which it is calculated.

Now suppose we would like to describe the statistical properties of $\lambda_r$ when the matrix
$r$ is replaced by the ranks $R(Y)$, where the rows of $Y$ are i.i.d. samples from a population
with copula $C(u|\theta)$. Since the distribution of the ranks of $Y$ is invariant with respect to
the marginal distributions, the choice of the marginal model in Lemma 2.1 is immaterial and
can be selected to facilitate analysis. Our strategy will be to select a marginal model for
which a rank-based approximation $\lambda_{\hat y}$ of $\lambda_y$ is available. If $\lambda_{\hat y}$ is
rank-measurable, then

\[ \begin{aligned}
\lambda_r &= \log E[e^{\lambda_y} | R(Y)] \\
&= \lambda_{\hat y} + \log E[e^{\lambda_y - \lambda_{\hat y}} | R(Y)].
\end{aligned} \]

If the approximation of $\lambda_y$ by $\lambda_{\hat y}$ is sufficiently accurate to make the remainder
term, $\log E[e^{\lambda_y - \lambda_{\hat y}} | R(Y)]$, converge in probability to zero as $n \to \infty$,
then the asymptotic distribution of $\lambda_r$ is determined by that of $\lambda_{\hat y}$. As shown by
the following proposition, the local asymptotic normality (LAN) of $\lambda_y$, along with
$\lambda_{\hat y} - \lambda_y = o_p(1)$, implies convergence of the remainder. In what follows, all limits
are as $n \to \infty$ unless otherwise noted.

Proposition 2.2. Let $\lambda_y$ be LAN and let $\lambda_{\hat y}$ be a rank-measurable approximation
to $\lambda_y$, such that $\lambda_y - \lambda_{\hat y} \xrightarrow{p} 0$. If $Y_1, \ldots, Y_n \sim$ i.i.d.
$F(y|\theta,\psi)$, then $\log E[e^{\lambda_y - \lambda_{\hat y}} | R(Y)] \xrightarrow{p} 0$.

This proposition was essentially proven at the end of the proof of Theorem 1 of Bickel and
Ritov (1997) in the context of the regression transformation model, although details were
omitted. We include the proof here for completeness. The proof of Proposition 2.2 makes use
of the following lemma about conditional expectations:

Lemma 2.3. If $E[|X_n|] \to 0$ and $Z_n$ is a random sequence, then $E[X_n|Z_n] \xrightarrow{p} 0$.

Proof. By Markov's inequality,

\[ \begin{aligned}
\Pr(|E[X_n|Z_n]| > \epsilon) &\le E[|E[X_n|Z_n]|]/\epsilon \\
&\le E[E[|X_n| \,|\, Z_n]]/\epsilon \\
&= E[|X_n|]/\epsilon \to 0.
\end{aligned} \]

□

In particular, note that if $X_n \xrightarrow{p} 0$ and $X_n$ is bounded or uniformly integrable,
then $E[|X_n|] \to 0$ and so $E[X_n|Z_n] \xrightarrow{p} 0$.

Proof of Proposition 2.2. Let $U_n = e^{\lambda_y}$, $V_n = e^{\lambda_{\hat y}}$ and
$R_n = R(Y_1, \ldots, Y_n)$, so that the exponential of the remainder term can be written as
$E[U_n/V_n | R_n]$. For any $M > 1$ we can write

\[ \begin{aligned}
\left| E\left[ \tfrac{U_n}{V_n} - 1 \,\middle|\, R_n \right] \right|
&\le E\left[ \left| \tfrac{U_n}{V_n} - 1 \right| \,\middle|\, R_n \right] \\
&= E\left[ \left| \tfrac{U_n}{V_n} - 1 \right| 1_{(U_n/V_n \le M)} \,\middle|\, R_n \right]
 + E\left[ \left| \tfrac{U_n}{V_n} - 1 \right| 1_{(U_n/V_n > M)} \,\middle|\, R_n \right] \\
&\le E\left[ \left| \tfrac{U_n}{V_n} - 1 \right| 1_{(U_n/V_n \le M)} \,\middle|\, R_n \right]
 + E\left[ \tfrac{U_n}{V_n} 1_{(U_n/V_n > M)} \,\middle|\, R_n \right]
 + E\left[ 1_{(U_n/V_n > M)} \,\middle|\, R_n \right] \\
&= E\left[ \left| \tfrac{U_n}{V_n} - 1 \right| 1_{(U_n/V_n \le M)} \,\middle|\, R_n \right]
 + V_n^{-1} E\left[ U_n 1_{(U_n/V_n > M)} \,\middle|\, R_n \right]
 + \Pr\left( \tfrac{U_n}{V_n} > M \,\middle|\, R_n \right) \\
&\equiv a_n + b_n + c_n.
\end{aligned} \]

We will show that each of $a_n$, $b_n$ and $c_n$ converges in probability to zero. To do so, we
will make use of the following facts:

1. $U_n/V_n = e^{\lambda_y - \lambda_{\hat y}} \xrightarrow{p} 1$ by the continuous mapping theorem;

2. $U_n = e^{\lambda_y}$ and $V_n^{-1} = e^{-\lambda_{\hat y}}$ are bounded in probability, as $\lambda_y$
and $\lambda_{\hat y}$ converge in distribution;

3. $\{U_n : n \in \mathbb{N}\}$ is uniformly integrable, since $\log U_n = \lambda_y$ is LAN (Hall and
Loynes, 1977).

To see that $a_n$ and $c_n$ converge in probability to zero, note that both
$|U_n/V_n - 1| 1_{(U_n/V_n \le M)}$ and $1_{(U_n/V_n > M)}$ are bounded random variables that
converge in probability to zero, so their conditional expectations given $R_n$ converge in
probability to zero by the lemma.

For the sequence $b_n$, note that $U_n$ is $O_p(1)$ as it converges in distribution, and
$1_{(U_n/V_n > M)}$ is $o_p(1)$ as $U_n/V_n \xrightarrow{p} 1$, so
$\tilde U_n = U_n 1_{(U_n/V_n > M)}$ is $o_p(1)$. Now $0 \le \tilde U_n \le U_n$ for each $n$, and
$\{U_n : n \in \mathbb{N}\}$ is uniformly integrable, so $\{\tilde U_n : n \in \mathbb{N}\}$ is uniformly
integrable as well. This and $\tilde U_n \xrightarrow{p} 0$ imply that
$E[|\tilde U_n|] = E[\tilde U_n] \to 0$, and so $E[\tilde U_n|R_n] \xrightarrow{p} 0$ by the lemma.
Since $b_n = V_n^{-1} E[\tilde U_n|R_n]$, and $V_n^{-1}$ is $O_p(1)$, $b_n$ is $o_p(1)$. □

Now if $\lambda_y$ is LAN then $\lambda_y \xrightarrow{d} Z$ where $Z$ has a normal distribution. If the
conditions of Proposition 2.2 hold, then we must also have $\lambda_{\hat y} \xrightarrow{d} Z$ and
$\lambda_r \xrightarrow{d} Z$. We summarize the results of Lemma 2.1 and Proposition 2.2 with the
following theorem:

Theorem 2.4. Let $\{F(y|\theta,\psi) : \theta \in \Theta, \psi \in \Psi\}$ be a copula parameterized model
where for given values of $\theta$ and $s$ there exist values of $\psi$ and $t$ such that under i.i.d.
sampling from $F(y|\theta,\psi)$,

1. $\lambda_y$ is LAN, so that $\lambda_y \xrightarrow{d} Z$, and

2. there exists a rank-measurable approximation $\lambda_{\hat y}$ such that
$\lambda_y - \lambda_{\hat y} \xrightarrow{p} 0$.

Then $\lambda_r \xrightarrow{d} Z$ as $n \to \infty$ under i.i.d. sampling from any population with copula
$C(u|\theta)$ equal to that of $F(y|\theta,\psi)$ and arbitrary continuous marginal distributions.

Note that under the conditions of the theorem and sampling from $F(y|\theta,\psi)$, the
differences between each pair of $\lambda_y$, $\lambda_{\hat y}$ and $\lambda_r$ converge to zero and all three
of these log-likelihood ratios converge to the same normal random variable. If the data are
being sampled from a population with the same copula as $F(y|\theta,\psi)$ but different margins,
then there exists a transformation of the data such that $F(y|\theta,\psi)$ is the distribution of
the transformed population. Thus, all we need is that conditions 1 and 2 hold for some
marginal model and values of $\psi$ and $t$. This is enough to give the convergence of $\lambda_r$ to a
normal random variable under sampling from any population with continuous marginal
distributions and the same copula as $F(y|\theta,\psi)$.



3 Rank approximations to normal quadratic forms 

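Let $Y_1, \ldots, Y_n$ be i.i.d. samples from a member of a class of mean-zero p-variate normal
distributions indexed by a correlation parameter $\theta \in \Theta$ and a variance parameter
$\psi \in \Psi$. As discussed in the next section, the local likelihood ratio $\lambda_y$ can be
expressed as a quadratic function of $Y_1, \ldots, Y_n$, taking the form

\[ \lambda_y(s,t) = \left( \frac{1}{\sqrt{n}} \sum_{i=1}^n Y_i^T A Y_i \right) + c(\theta,\psi,s,t) + o_p(1) \]

for some matrix $A$ which could be a function of $s$, $t$, $\theta$ and $\psi$. A natural rank-based
approximation to $\lambda_y$ is

\[ \lambda_{\hat y}(s,t) = \left( \frac{1}{\sqrt{n}} \sum_{i=1}^n \hat Y_i^T A \hat Y_i \right) + c(\theta,\psi,s,t), \]

where $\{\hat Y_{i,j} : i \in \{1, \ldots, n\}, j \in \{1, \ldots, p\}\}$ are the (approximate) normal
scores, defined by $R = R(Y)$ and
$\hat Y_{i,j} = \sqrt{\mathrm{Var}[Y_{i,j}|\psi]} \times \Phi^{-1}(R_{i,j}/(n+1))$. Whether or not
$\lambda_{\hat y} - \lambda_y \xrightarrow{p} 0$ therefore depends on the convergence to zero of the
difference between the quadratic terms of $\lambda_y$ and $\lambda_{\hat y}$.

Let $Y_1, \ldots, Y_n \sim$ i.i.d. $N_p(0, C)$, where $C$ is a correlation matrix, so the normal
scores are given by $\hat Y_{i,j} = \Phi^{-1}(R_{i,j}/(n+1))$. In this section, we will find
conditions on $C$ and $A$ such that

\[ S_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n \left( \hat Y_i^T A \hat Y_i - Y_i^T A Y_i \right) \xrightarrow{p} 0 \quad \text{as } n \to \infty. \]

Let $\tilde A = (A + A^T)/2$, so that $y^T A y = y^T \tilde A y$ for all $y \in \mathbb{R}^p$. Then

\[ \begin{aligned}
\hat y^T A \hat y - y^T A y &= \hat y^T \tilde A \hat y - y^T \tilde A y \\
&= (\hat y - y)^T \tilde A (\hat y - y) + 2 (\hat y - y)^T \tilde A y,
\end{aligned} \]

the latter equality holding since $\tilde A$ is symmetric. From this, we can write
$S_n = Q_n + 2 L_n$ where

\[ Q_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n (\hat Y_i - Y_i)^T \tilde A (\hat Y_i - Y_i), \qquad
   L_n = \frac{1}{\sqrt{n}} \sum_{i=1}^n (\hat Y_i - Y_i)^T \tilde A Y_i. \]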

We will show that $Q_n \xrightarrow{p} 0$ for all $C$ and $A$, and find conditions on $C$ and $A$
under which $L_n \xrightarrow{p} 0$. To do this we make use of a theorem of de Wet and Venter
(1972).

Theorem (de Wet and Venter). Let $Z_1, \ldots, Z_n$ be i.i.d. standard normal variables. Let
$R_i$ be the rank of $Z_i$, let $\hat Z_i = \Phi^{-1}[R_i/(n+1)]$, and let
$W_n = \sum (\hat Z_i - Z_i)^2$. Then

\[ W_n - v_n \xrightarrow{d} \gamma, \]

where $v_n$ is a sequence of constants such that $C_1 \log\log n < v_n < C_2 \log\log n$ for all
$n$ and some $C_1, C_2 > 0$, and $E[\gamma] = 0$ and $\mathrm{Var}[\gamma] = \pi^2/3$.

From this theorem we have the following corollary:

Corollary 3.1. Let $Z_1, \ldots, Z_n$ be i.i.d. standard normal random variables. Then
$\sum (\hat Z_i - Z_i)^2 / \log\log n$ is bounded in probability, and
$\sum (\hat Z_i - Z_i)^2 / \sqrt{n} \xrightarrow{p} 0$.

Proof. Let $\tilde W_n = W_n/(\log\log n)$ and $\tilde v_n = v_n/(\log\log n)$, where $W_n$ and
$v_n$ are defined as in the theorem. Then
$|\tilde W_n| \le |\tilde W_n - \tilde v_n| + |\tilde v_n|$. Now $W_n - v_n$ converges in
distribution so it is $O_p(1)$, and so $\tilde W_n - \tilde v_n$ is $o_p(1)$. The deterministic
term $\tilde v_n$ is bounded by $C_2$, and so $|\tilde W_n - \tilde v_n| + |\tilde v_n|$ is bounded
in probability, implying $|\tilde W_n|$ is bounded in probability as well. Convergence in
probability of $\sum (\hat Z_i - Z_i)^2 / \sqrt{n}$ to zero then follows. □
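The corollary can be illustrated by simulation. The following is a rough sketch under our own assumptions: the helper names `normal_scores` and `scaled_score_error`, the seed and the sample sizes are hypothetical choices, and a single simulated draw per sample size is only suggestive of the limiting behaviour, not a demonstration of it.

```python
import math
import random
from statistics import NormalDist

def normal_scores(z):
    """Approximate normal scores: inverse standard normal CDF of rank / (n + 1)."""
    n = len(z)
    order = sorted(range(n), key=lambda i: z[i])
    std_normal = NormalDist()
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = std_normal.inv_cdf(rank / (n + 1))
    return scores

def scaled_score_error(n, rng):
    """Return W_n / sqrt(n), where W_n is the sum of squared score errors."""
    z = [rng.gauss(0.0, 1.0) for _ in range(n)]
    z_hat = normal_scores(z)
    w_n = sum((zh - zi) ** 2 for zh, zi in zip(z_hat, z))
    return w_n / math.sqrt(n)

rng = random.Random(0)
for n in (100, 1000, 10000):
    # Under Corollary 3.1 these values should tend to shrink as n grows.
    print(n, scaled_score_error(n, rng))
```

The scaling by $\sqrt{n}$ is the one needed for $Q_n$ below; the corollary says that $W_n$ itself grows only at the rate $\log\log n$.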
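Returning to $S_n$, to see that $Q_n \xrightarrow{p} 0$ note that

\[ Q_n = \sum_{j=1}^p \tilde a_{j,j} \left( \frac{1}{\sqrt{n}} \sum_{i=1}^n (\hat Y_{i,j} - Y_{i,j})^2 \right)
 + \sum_{j \ne k} \tilde a_{j,k} \left( \frac{1}{\sqrt{n}} \sum_{i=1}^n (\hat Y_{i,j} - Y_{i,j})(\hat Y_{i,k} - Y_{i,k}) \right). \]

The squared terms converge in probability to zero by the corollary and the cross term
converges in probability to zero by the Cauchy-Schwarz inequality.
To find conditions under which $L_n \xrightarrow{p} 0$, note that

\[ (\hat y - y)^T \tilde A y = \sum_{j=1}^p (\hat y_j - y_j) \tilde a_j^T y, \]

where $\tilde a_1, \ldots, \tilde a_p$ are the rows of $\tilde A$. This gives

\[ L_n = \sum_{j=1}^p L_{n,j} = \sum_{j=1}^p \frac{1}{\sqrt{n}} \sum_{i=1}^n (\hat Y_{i,j} - Y_{i,j}) \tilde a_j^T Y_i. \]

Let $c_j$ be the $j$th row of $C$, the correlation matrix of $Y$. We will show that
$L_{n,j} \xrightarrow{p} 0$ if $\tilde a_j^T c_j = 0$ using an argument based on conditional
expectations. Considering $L_{n,1}$ for example, we have

\[ E[L_{n,1} | Y_{1,1}, \ldots, Y_{n,1}] = \frac{1}{\sqrt{n}} \sum_{i=1}^n (\hat Y_{i,1} - Y_{i,1}) E[\tilde a_1^T Y_i | Y_{i,1}] = 0 \]

if $\tilde a_1^T c_1 = 0$, since $E[\tilde a_1^T Y_i | Y_{i,1}] = \tilde a_1^T c_1 Y_{i,1}$. The
conditional expectation of $L_{n,1}^2$ is given by

\[ \begin{aligned}
E[L_{n,1}^2 | Y_{1,1}, \ldots, Y_{n,1}] ={}& \frac{1}{n} \sum_{i=1}^n (\hat Y_{i,1} - Y_{i,1})^2 E[(\tilde a_1^T Y_i)^2 | Y_{i,1}] \\
&+ \frac{1}{n} \sum_{i \ne i'} (\hat Y_{i,1} - Y_{i,1})(\hat Y_{i',1} - Y_{i',1}) E[\tilde a_1^T Y_i | Y_{i,1}] \, E[\tilde a_1^T Y_{i'} | Y_{i',1}].
\end{aligned} \]

The expectations in the second sum are both zero when $\tilde a_1^T c_1 = 0$, leaving

\[ \mathrm{Var}[L_{n,1} | Y_{1,1}, \ldots, Y_{n,1}] = E[L_{n,1}^2 | Y_{1,1}, \ldots, Y_{n,1}]
 = \frac{1}{n} \sum_{i=1}^n (\hat Y_{i,1} - Y_{i,1})^2 E[(\tilde a_1^T Y_i)^2 | Y_{i,1}]. \]

The conditional expectation $E[(\tilde a_1^T Y_i)^2 | Y_{i,1}]$ can be obtained by noting that if
$Y \sim N_p(0, C)$, then the conditional distribution of $Y$ given $Y_1$ can be expressed as

\[ Y | Y_1 \stackrel{d}{=} c_1 Y_1 + G \epsilon, \]

where $G G^T = C - c_1 c_1^T$ and $\epsilon$ is p-variate standard normal. The desired second
moment is then

\[ \begin{aligned}
E[(\tilde a_1^T Y)^2 | Y_1] &= \tilde a_1^T E[Y Y^T | Y_1] \tilde a_1 \\
&= \tilde a_1^T E[Y_1^2 c_1 c_1^T + 2 Y_1 c_1 \epsilon^T G^T + G \epsilon \epsilon^T G^T | Y_1] \tilde a_1 \\
&= (Y_1^2 - 1)(\tilde a_1^T c_1)^2 + \tilde a_1^T C \tilde a_1,
\end{aligned} \]

which is equal to $\tilde a_1^T C \tilde a_1$ under the condition that $\tilde a_1^T c_1 = 0$. Letting
$\gamma_1 = \tilde a_1^T C \tilde a_1$, the conditional variance of $L_{n,1}$ given the observations
for the first variate is then

\[ \mathrm{Var}[L_{n,1} | Y_{1,1}, \ldots, Y_{n,1}] = \frac{\gamma_1}{n} \sum_{i=1}^n (\hat Y_{i,1} - Y_{i,1})^2. \]

Applying Chebyshev's inequality gives

\[ \begin{aligned}
\Pr(|L_{n,1}| > \epsilon | Y_{1,1}, \ldots, Y_{n,1}) &\le 1 \wedge \mathrm{Var}[L_{n,1} | Y_{1,1}, \ldots, Y_{n,1}]/\epsilon^2 \\
&= 1 \wedge \frac{\gamma_1}{\epsilon^2} \frac{\sum_{i=1}^n (\hat Y_{i,1} - Y_{i,1})^2}{n} \\
&\equiv 1 \wedge \tilde c_n \equiv c_n.
\end{aligned} \]

Now $\tilde c_n \xrightarrow{p} 0$ by Corollary 3.1 and therefore so does $c_n$. But as $c_n$ is
bounded, we have $E[c_n] \to 0$, giving

\[ \begin{aligned}
\Pr(|L_{n,1}| > \epsilon) &= E[\Pr(|L_{n,1}| > \epsilon | Y_{1,1}, \ldots, Y_{n,1})] \\
&\le E[c_n] \to 0,
\end{aligned} \]

and so $L_{n,1} \xrightarrow{p} 0$. The same argument can be applied to $L_{n,j}$ for each $j$, and
so $L_n = \sum_{j=1}^p L_{n,j} \xrightarrow{p} 0$ as long as $\tilde a_j^T c_j = 0$ for each
$j = 1, \ldots, p$, or equivalently, if the diagonal elements of $AC + A^T C$ are zero. This
result is summarized in the following theorem: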

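Theorem 3.2. Let $Y_1, \ldots, Y_n \sim$ i.i.d. $N_p(0, C)$ where $C$ is a correlation matrix, and
let $\hat Y_{i,j} = \Phi^{-1}(R_{i,j}/(n+1))$, where $R_{i,j}$ is the rank of $Y_{i,j}$ among
$Y_{1,j}, \ldots, Y_{n,j}$. Let $A$ be a matrix such that $\mathrm{diag}(AC + A^T C) = 0$. Then

\[ \frac{1}{\sqrt{n}} \sum_{i=1}^n \left( \hat Y_i^T A \hat Y_i - Y_i^T A Y_i \right) \xrightarrow{p} 0 \]

as $n \to \infty$.

Note that if $A$ is symmetric, the condition on $A$ reduces to the diagonal elements of $AC$
being zero.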
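The role of the diagonal condition can be explored with a small simulation. This is only a sketch under stated assumptions: the function names, the correlation value `rho = 0.5`, the two matrices and the sample sizes are hypothetical choices of ours. The matrix `a_good` satisfies $\mathrm{diag}(AC) = 0$, while the identity matrix `a_bad` does not, so one would expect $S_n$ to be small for large $n$ only in the first case.

```python
import math
import random
from statistics import NormalDist

STD_NORMAL = NormalDist()

def normal_scores(column):
    """Normal scores Phi^{-1}(rank / (n + 1)) for one column of data."""
    n = len(column)
    order = sorted(range(n), key=lambda i: column[i])
    scores = [0.0] * n
    for rank, i in enumerate(order, start=1):
        scores[i] = STD_NORMAL.inv_cdf(rank / (n + 1))
    return scores

def quad(a, y):
    """Quadratic form y^T a y for a 2 x 2 matrix a."""
    return sum(y[i] * a[i][j] * y[j] for i in range(2) for j in range(2))

def s_n(a, rho, n, rng):
    """S_n = n^{-1/2} sum_i (Yhat_i^T A Yhat_i - Y_i^T A Y_i), bivariate normal data."""
    y = []
    for _ in range(n):
        z1, z2 = rng.gauss(0.0, 1.0), rng.gauss(0.0, 1.0)
        y.append((z1, rho * z1 + math.sqrt(1.0 - rho ** 2) * z2))
    cols = [normal_scores([row[j] for row in y]) for j in range(2)]
    y_hat = list(zip(cols[0], cols[1]))
    return sum(quad(a, yh) - quad(a, yi) for yh, yi in zip(y_hat, y)) / math.sqrt(n)

rho = 0.5
a_good = [[-rho, 1.0], [1.0, -rho]]  # diag(A C) = (0, 0): condition of Theorem 3.2 holds
a_bad = [[1.0, 0.0], [0.0, 1.0]]     # diag(A C) = (1, 1): condition fails
rng = random.Random(0)
for n in (200, 2000, 20000):
    print(n, s_n(a_good, rho, n, rng), s_n(a_bad, rho, n, rng))
```

For `a_bad` the statistic compares the nearly deterministic sum of squared normal scores with a chi-squared-type sum, which is why it is not expected to vanish after scaling by $\sqrt{n}$.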
4 Submodels having equal marginal variances

4.1 The bivariate normal model

Klaassen and Wellner (1997) showed that for the bivariate normal copula model with
correlation $\theta \in [-1, 1]$, a rank-based estimate (the normal scores rank correlation
coefficient) is asymptotically efficient and has asymptotic variance equal to
$(1 - \theta^2)^2$. That $(1 - \theta^2)^2$ is a lower bound on the variance of rank-based
estimators of $\theta$ can be easily confirmed using the results of the previous section.

As these authors note, this asymptotic variance is equal to that of the MLE for $\theta$ in the
case of the bivariate normal model with equal but unknown marginal variances, which thus
constitutes a least favorable parametric submodel. This suggests that the local likelihood
ratio for this submodel provides an asymptotic approximation to the rank likelihood ratio.
To verify this, let $l(y)$ be the log probability density for the class of mean-zero bivariate
normal densities with correlation $\theta$ and equal marginal precisions (inverse-variances)
$\psi$. The log likelihood derivatives are

\[ l_\theta(y) = \frac{\theta(1 - \theta^2) + \psi y_1 y_2 (1 + \theta^2) - \psi (y_1^2 + y_2^2)\theta}{(1 - \theta^2)^2}, \qquad
   l_\psi(y) = \frac{1}{\psi} - \frac{(y_1^2 + y_2^2)/2 - \theta y_1 y_2}{1 - \theta^2}. \]

The information matrix is

\[ I = E[(\nabla l)(\nabla l)^T] =
\begin{pmatrix}
\dfrac{1 + \theta^2}{(1 - \theta^2)^2} & \dfrac{\theta}{\psi(1 - \theta^2)} \\[2ex]
\dfrac{\theta}{\psi(1 - \theta^2)} & \dfrac{1}{\psi^2}
\end{pmatrix}, \]

which gives the efficient score function

\[ l^*_\theta = l_\theta - I_{\theta\psi} I_{\psi\psi}^{-1} l_\psi = \frac{\psi}{(1 - \theta^2)^2} \left( y_1 y_2 - \theta (y_1^2 + y_2^2)/2 \right), \]

the efficient influence function

\[ \tilde l_\theta = I_{\theta\theta\cdot\psi}^{-1} l^*_\theta = \psi \left( y_1 y_2 - \theta (y_1^2 + y_2^2)/2 \right), \]

and the efficient information

\[ I_{\theta\theta\cdot\psi} = ([I^{-1}]_{11})^{-1} = I_{\theta\theta} - I_{\theta\psi}^2 / I_{\psi\psi}
 = \frac{1 + \theta^2}{(1 - \theta^2)^2} - \frac{\theta^2}{(1 - \theta^2)^2} = \frac{1}{(1 - \theta^2)^2}, \]

and so $I_{\theta\theta\cdot\psi}^{-1} = (1 - \theta^2)^2$ is the information bound for estimators of
$\theta$ in this parametric model.
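The arithmetic behind this bound can be checked directly. The sketch below assumes the information matrix entries $I_{\theta\theta} = (1+\theta^2)/(1-\theta^2)^2$, $I_{\theta\psi} = \theta/(\psi(1-\theta^2))$ and $I_{\psi\psi} = 1/\psi^2$, as we read them from the display above; the function names and the test values of $\theta$ and $\psi$ are hypothetical.

```python
def bivariate_information(theta, psi):
    """Entries (I_tt, I_tp, I_pp) of the information matrix for (theta, psi)."""
    i_tt = (1 + theta ** 2) / (1 - theta ** 2) ** 2
    i_tp = theta / (psi * (1 - theta ** 2))
    i_pp = 1 / psi ** 2
    return i_tt, i_tp, i_pp

def efficient_information(i_tt, i_tp, i_pp):
    """Information for theta after projecting out psi: I_tt - I_tp^2 / I_pp."""
    return i_tt - i_tp ** 2 / i_pp

theta = 0.4
for psi in (0.5, 1.0, 3.0):
    eff = efficient_information(*bivariate_information(theta, psi))
    # The two printed numbers should agree up to rounding, whatever the value of psi.
    print(psi, 1 / eff, (1 - theta ** 2) ** 2)
```

The factor $\psi$ cancels in $I_{\theta\psi}^2 / I_{\psi\psi}$, which is why the bound does not depend on the marginal precision.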

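If $Y_1, \ldots, Y_n$ are i.i.d. $N_2(0, C(\theta)/\psi)$, where
$C(\theta) = \begin{pmatrix} 1 & \theta \\ \theta & 1 \end{pmatrix}$, then the local likelihood ratio
$\lambda_y(s,t)$ defined by (2) has a limiting normal distribution:

\[ \begin{aligned}
\lambda_y(s,t) &= \frac{1}{\sqrt{n}} \sum_{i=1}^n [s \times l_\theta(Y_i) + t \times l_\psi(Y_i)] - [s,t] I [s,t]^T/2 + o_p(1) \\
&\xrightarrow{d} N(-[s,t] I [s,t]^T/2, \; [s,t] I [s,t]^T) \quad \text{as } n \to \infty.
\end{aligned} \]

Letting $t = -I_{\psi\psi}^{-1} I_{\psi\theta} s$, the local log likelihood ratio $\lambda_y(s)$ can be
expressed as

\[ \lambda_y(s) = \frac{s}{\sqrt{n}} \sum_{i=1}^n \left[ l_\theta(Y_i) - I_{\theta\psi} I_{\psi\psi}^{-1} l_\psi(Y_i) \right] - s^2 I_{\theta\theta\cdot\psi}/2 + o_p(1), \]

which converges as $n \to \infty$ to a $N(-s^2 I_{\theta\theta\cdot\psi}/2, \; s^2 I_{\theta\theta\cdot\psi})$
random variable if $Y_1, \ldots, Y_n$ are i.i.d. $N_2(0, C(\theta)/\psi)$. Note that
$I_{\theta\theta\cdot\psi}$, and therefore the asymptotic distribution of $\lambda_y$, do not depend on
the value of $\psi$. Letting $\psi = 1$, the value of $l_\theta - I_{\theta\psi} I_{\psi\psi}^{-1} l_\psi$ is
given by

\[ l_\theta(y) - I_{\theta\psi} I_{\psi\psi}^{-1} l_\psi(y) = \frac{y_1 y_2 - \theta (y_1^2 + y_2^2)/2}{(1 - \theta^2)^2} = y^T A y, \]

where $A$ is the symmetric matrix

\[ A = \frac{1}{2(1 - \theta^2)^2} \begin{pmatrix} -\theta & 1 \\ 1 & -\theta \end{pmatrix}. \]

We are now in a position to apply the results of the previous sections. If
$Y_1, \ldots, Y_n \sim$ i.i.d. $N_2(0, C(\theta))$, then $\lambda_y(s)$ is LAN, so that

\[ \lambda_y(s) = \frac{s}{\sqrt{n}} \sum_{i=1}^n Y_i^T A Y_i - s^2 I_{\theta\theta\cdot\psi}/2 + o_p(1)
 \xrightarrow{d} N(-s^2 I_{\theta\theta\cdot\psi}/2, \; s^2 I_{\theta\theta\cdot\psi}). \]

A rank-measurable approximation to $\lambda_y(s)$ is given by

\[ \lambda_{\hat y}(s) = \frac{s}{\sqrt{n}} \sum_{i=1}^n \hat Y_i^T A \hat Y_i - s^2 I_{\theta\theta\cdot\psi}/2. \]

Letting $C$ be the bivariate correlation matrix with correlation $\theta$, it is easy to check
that the diagonal elements of $AC$ are zero. Since $A$ is also symmetric, the conditions of
Theorem 3.2 are met and we have $\lambda_{\hat y} - \lambda_y = o_p(1)$. By Theorem 2.4, under i.i.d.
sampling from a population with a bivariate normal copula with correlation $\theta$, we have

\[ \lambda_r(s) \xrightarrow{d} N(-s^2 I_{\theta\theta\cdot\psi}/2, \; s^2 I_{\theta\theta\cdot\psi}). \]

The semiparametric information bound for rank-based estimators of $\theta$ is thus
$I_{\theta\theta\cdot\psi}^{-1}$, confirming the result of Klaassen and Wellner (1997).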
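The two facts used in this argument, that $y^T A y$ reproduces the efficient score at $\psi = 1$ and that $\mathrm{diag}(AC) = 0$, are easy to confirm numerically. The sketch below is illustrative only: the helper names, the value $\theta = 0.6$ and the evaluation point `y` are our own assumptions, and $A$ is taken to be $\frac{1}{2(1-\theta^2)^2}\begin{pmatrix} -\theta & 1 \\ 1 & -\theta \end{pmatrix}$.

```python
def bivariate_a(theta):
    """Matrix A of Section 4.1 for correlation theta (with psi = 1)."""
    scale = 1.0 / (2.0 * (1.0 - theta ** 2) ** 2)
    return [[-theta * scale, scale], [scale, -theta * scale]]

def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def quad_form(a, y):
    """Quadratic form y^T a y."""
    n = len(y)
    return sum(y[i] * a[i][j] * y[j] for i in range(n) for j in range(n))

theta = 0.6
a = bivariate_a(theta)
c = [[1.0, theta], [theta, 1.0]]
ac = matmul(a, c)
diag_ac = [ac[0][0], ac[1][1]]

y = [0.7, -1.3]  # arbitrary illustrative point
efficient_score = (y[0] * y[1] - theta * (y[0] ** 2 + y[1] ** 2) / 2) / (1 - theta ** 2) ** 2

# Both quantities should be zero up to rounding.
print(diag_ac, quad_form(a, y) - efficient_score)
```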
4.2 Multivariate models

A natural question is whether or not the above result can be extended to p-variate Gaussian
copula models where $p > 2$. In this section we identify a class of correlation models for
which the limiting distribution of the rank likelihood ratio is the same as that of the
likelihood ratio for the corresponding parametric multivariate normal model with equal
marginal variances.

Let $\{N_p(0, C(\theta)/\psi) : \theta \in \Theta, \psi \in \mathbb{R}^+\}$ denote a class of p-variate
mean-zero normal distributions, where $\psi$ parameterizes the common marginal precision and
the correlation matrix is parameterized as $C(\theta)$, a twice-differentiable matrix-valued
function of $\theta \in \Theta \subset \mathbb{R}^q$. For the calculations that follow, it will be
convenient to express the likelihood in terms of the inverse correlation matrix
$B(\theta) = C(\theta)^{-1}$, giving

\[ l(y : \theta, \psi) = (-p \log 2\pi + p \log \psi + \log |B| - \psi y^T B y)/2. \tag{3} \]

The corresponding likelihood derivatives are

\[ \begin{aligned}
l_\psi(y) &= (p/\psi - y^T B y)/2, & l_{\psi\theta_k}(y) &= -y^T B_{\theta_k} y/2, \\
l_{\theta_k}(y) &= (\mathrm{tr}(B_{\theta_k} C) - \psi y^T B_{\theta_k} y)/2, & l_{\psi\psi}(y) &= -p/(2\psi^2),
\end{aligned} \]

where $B_{\theta_k}$ is the matrix of derivatives of the elements of $B$ with respect to a
particular element $\theta_k$ of $\theta$. The parametric local likelihood ratio $\lambda_y(s,t)$ can be
written as

\[ \lambda_y(s,t) = \frac{1}{\sqrt{n}} \sum_{i=1}^n [s^T l_\theta(Y_i) + t \, l_\psi(Y_i)] - \begin{bmatrix} s \\ t \end{bmatrix}^T I \begin{bmatrix} s \\ t \end{bmatrix} / 2 + o_p(1), \]

which, under i.i.d. sampling from $N_p(0, C(\theta)/\psi)$, converges in distribution to a
$N(-u^T I u/2, \; u^T I u)$ random variable, where $u^T = (s^T, t)$ and $I$ is the information
matrix for $(\theta, \psi)$. The elements of $I$ needed for the results in this section are

\[ I_{\theta\psi} = \{I_{\theta_k \psi}\} = \{-\mathrm{tr}(B C_{\theta_k})/(2\psi)\}, \qquad I_{\psi\psi} = p/(2\psi^2), \]

where the fact that $B_{\theta_k} C = -B C_{\theta_k}$ has been used in the calculation of
$I_{\psi\theta}$. If we set $t = -I_{\psi\psi}^{-1} I_{\psi\theta} s$, then $\lambda_y$ converges in
distribution to a $N(-s^T I_{\theta\theta\cdot\psi} s/2, \; s^T I_{\theta\theta\cdot\psi} s)$ random variable,
where $I_{\theta\theta\cdot\psi} = I_{\theta\theta} - I_{\theta\psi} I_{\psi\psi}^{-1} I_{\psi\theta}$ is the information
for $\theta$ in this parametric model.

We will now find conditions on $C(\theta)$ under which $\lambda_r(s)$ converges in distribution to the
same normal random variable. A candidate rank-measurable approximation to $\lambda_y(s,t)$ is
given by

\[ \lambda_{\hat y}(s,t) = \frac{1}{\sqrt{n}} \sum_{i=1}^n [s^T l_\theta(\hat Y_i) + t \, l_\psi(\hat Y_i)] - \begin{bmatrix} s \\ t \end{bmatrix}^T I \begin{bmatrix} s \\ t \end{bmatrix} / 2. \]

Recall that if for our given $s$ and $\theta$ we can find a $t$ and $\psi$ such that
$\lambda_{\hat y} - \lambda_y = o_p(1)$, then the conditions of Theorem 2.4 will be met and the
asymptotic distribution of $\lambda_r(s)$ will be that of $\lambda_y(s,t)$. With this in mind, let
$t = h^T s$ for some $h \in \mathbb{R}^q$, and write $\lambda_y(s, h^T s) = \lambda_y(s)$. We will find
conditions on $C(\theta)$ such that there exists an $h$ for which
$\lambda_{\hat y}(s) - \lambda_y(s) = o_p(1)$, and will show that any such $h$ must be equal to
$-I_{\psi\psi}^{-1} I_{\psi\theta}$.

With $t = h^T s$ and $\psi = 1$, we have

\[ \begin{aligned}
s^T l_\theta(y) + t \, l_\psi(y) &= s^T [l_\theta(y) + h \, l_\psi(y)] \\
&= -\sum_{k=1}^q s_k y^T (B_{\theta_k} + h_k B) y/2 + c(\theta, s, h) \\
&= \sum_{k=1}^q s_k y^T A_k y + c(\theta, s, h),
\end{aligned} \]

where $A_k = -(B_{\theta_k} + h_k B)/2 = (B C_{\theta_k} B - h_k B)/2$ and $c(\theta, s, h)$ does not
depend on $y$. The difference between $\lambda_{\hat y}$ and $\lambda_y$ is then

\[ \lambda_{\hat y} - \lambda_y = \sum_{k=1}^q s_k \left( \frac{1}{\sqrt{n}} \sum_{i=1}^n \left[ \hat Y_i^T A_k \hat Y_i - Y_i^T A_k Y_i \right] \right) + o_p(1). \]

Since $A_k$ is symmetric, Theorem 3.2 implies that this difference will converge in
probability to zero if the diagonal elements of $A_k C$ are zero for each $k = 1, \ldots, q$.
This condition can equivalently be written as follows:

\[ \begin{aligned}
0 &= \mathrm{diag}(A_k C) \\
&= \mathrm{diag}(B C_{\theta_k} B C - h_k B C)/2 \\
&= \mathrm{diag}(B C_{\theta_k} - h_k \mathbf{I})/2, \\
h_k \mathbf{1} &= \mathrm{diag}(B C_{\theta_k}),
\end{aligned} \]

where $\mathbf{I}$ is the $p \times p$ identity matrix and $\mathbf{1}$ is the $p$-vector of ones.

The above condition can only be met if, for each $k$, the diagonal elements of $B C_{\theta_k}$
all take on a common value. If they do, then the convergence in probability of
$\lambda_{\hat y}(s,t) - \lambda_y(s,t)$ to zero can be obtained by setting $t = h^T s$, where
$h_k = \mathrm{tr}(B C_{\theta_k})/p$.

Note that in this case where $\psi = 1$,
$h_k = \mathrm{tr}(B C_{\theta_k})/p = -I_{\psi\psi}^{-1} I_{\psi\theta_k}$. By setting
$t = -I_{\psi\psi}^{-1} I_{\psi\theta} s = h^T s$, the value of $A_k$ will satisfy the conditions of
Theorem 3.2 for each $k = 1, \ldots, q$. Therefore, $\lambda_{\hat y} - \lambda_y = o_p(1)$ under i.i.d.
sampling from $N_p(0, C(\theta)/\psi)$, and so the conditions of Theorem 2.4 are met. The limiting
distribution of $\lambda_r$ under sampling from the copula is therefore the same as that of
$\lambda_y$ under sampling from the corresponding multivariate normal distribution.
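The common-diagonal condition on $B C_{\theta_k}$ can be checked numerically for a given correlation model. The sketch below is an illustration under our own assumptions: the helper functions, the value $\theta = 0.3$ and the two example models are hypothetical choices. The exchangeable model is expected to satisfy the condition, while the AR(1) model is included only as a contrast in which it fails.

```python
def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def inverse(m):
    """Gauss-Jordan inverse with partial pivoting (small nonsingular matrices)."""
    n = len(m)
    aug = [list(row) + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        scale = aug[col][col]
        aug[col] = [v / scale for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def diag_bc_theta(c, c_theta):
    """Diagonal of B C_theta, where B is the inverse of C."""
    m = matmul(inverse(c), c_theta)
    return [m[i][i] for i in range(len(c))]

theta = 0.3
# Exchangeable model, p = 4: all off-diagonal correlations equal theta.
exch = [[1.0 if i == j else theta for j in range(4)] for i in range(4)]
exch_d = [[0.0 if i == j else 1.0 for j in range(4)] for i in range(4)]
# AR(1) model, p = 3: correlation theta^|i-j| (used here only as a contrast).
ar1 = [[theta ** abs(i - j) for j in range(3)] for i in range(3)]
ar1_d = [[abs(i - j) * theta ** (abs(i - j) - 1) if i != j else 0.0
          for j in range(3)] for i in range(3)]

print(diag_bc_theta(exch, exch_d))  # entries agree up to rounding
print(diag_bc_theta(ar1, ar1_d))    # entries differ

# With h = tr(B C_theta) / p, the matrix A = (B C_theta B - h B) / 2 has diag(A C) = 0.
b = inverse(exch)
bcb = matmul(matmul(b, exch_d), b)
h = sum(diag_bc_theta(exch, exch_d)) / 4
a = [[(bcb[i][j] - h * b[i][j]) / 2 for j in range(4)] for i in range(4)]
ac = matmul(a, exch)
print([ac[i][i] for i in range(4)])
```

The last line prints the diagonal of $A C$ for the exchangeable model, which should vanish up to rounding error once $h$ is set to $\mathrm{tr}(B C_\theta)/p$.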



Theorem 4.1. Let $\{C(\theta) : \theta \in \Theta \subset \mathbb{R}^q\}$ be a collection of correlation
matrices such that $C(\theta)$ is twice differentiable, and for each $k$, the diagonal entries of
$B C_{\theta_k}$ are equal to some common value. If $Y_1, \ldots, Y_n$ are i.i.d. from a population
with continuous marginal distributions and copula $C(\theta)$ for some $\theta \in \Theta$, then the
distribution of the rank likelihood ratio $\lambda_r(s)$ converges to a
$N(-s^T I_{\theta\theta\cdot\psi} s/2, \; s^T I_{\theta\theta\cdot\psi} s)$ distribution, where
$I_{\theta\theta\cdot\psi}$ is the information for $\theta$ in the normal model with correlation $C(\theta)$
and equal marginal precisions $\psi$.



4.3 Examples 

The condition on the diagonal entries of BC^ is satisfied for some well-known models. For 
example, the one-parameter exchangeable correlation matrix {C(8) : 9 G [—1,1]}, for which 
all off-diagonal elements are equal to 9, satisfies the condition. In fact, the condition will be 
satisfied for any model in which the rows of C(0) are permutations of one another. To see 
this, note that if Cj, the ith row of C(0), is a permutation of Cj, then b iy the ith row of B, 
is the same permutation of bj. Therefore bjce k ^ = bjce k j for each i, j and k. Subclasses of 
such correlation matrices includes circular correlation models, often used for seasonal data 



Olkin and Press, 1969, Khattree and Naik, 1994 , and any model in which the rows of C 



are permutations of circular matrices. 

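For illustration, we calculate the information $I_{\theta\theta\cdot\psi}$ for two one-parameter models satisfying
the symmetry condition on $BC_\theta$. For any one-parameter model, the second derivative
of the log-likelihood with respect to $\theta$ is

$$
\begin{aligned}
l_{\theta\theta}(y) &= \tfrac{d}{d\theta}\bigl(\mathrm{tr}(B_\theta C) - \psi\, y^T B_\theta y\bigr)/2 \\
 &= \bigl(\mathrm{tr}(B_\theta C_\theta) + \mathrm{tr}(B_{\theta\theta} C) - \psi\, y^T B_{\theta\theta} y\bigr)/2 \\
 &= \bigl(-\mathrm{tr}(BC_\theta BC_\theta) + \mathrm{tr}(B_{\theta\theta} C) - \psi\, y^T B_{\theta\theta} y\bigr)/2
\end{aligned}
$$

and so the information for $\theta$ if $\psi$ were known is $I_{\theta\theta} = -E[l_{\theta\theta}(Y)] = \mathrm{tr}(BC_\theta BC_\theta)/2$. The
information for $\theta$ when $\psi$ is unknown is then

$$
I_{\theta\theta\cdot\psi} = \tfrac{1}{2}\bigl[\mathrm{tr}(BC_\theta BC_\theta) - \mathrm{tr}(BC_\theta)^2/p\bigr],
$$

so the information loss in going from known to unknown margins is $I_{\theta\psi}^2 / I_{\psi\psi} = \mathrm{tr}(BC_\theta)^2/(2p)$.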
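As a hedged illustration of these two formulas, the sketch below evaluates $I_{\theta\theta}$ and $I_{\theta\theta\cdot\psi}$ for an arbitrary pair $(C, C_\theta)$ supplied by the user. The function name `info_known_and_equal` is an assumption made for this example. The bivariate case is used as a check, since there the formulas reduce to $(1+\theta^2)/(1-\theta^2)^2$ and $(1-\theta^2)^{-2}$.

```python
import numpy as np

def info_known_and_equal(C, C_theta):
    """Return (I_tt, I_tt_psi) for a one-parameter Gaussian copula model.

    I_tt     = tr(B C_theta B C_theta) / 2          (margins known)
    I_tt_psi = I_tt - tr(B C_theta)^2 / (2 p)       (common unknown precision)
    """
    p = C.shape[0]
    M = np.linalg.solve(C, C_theta)        # M = B C_theta with B = C^{-1}
    I_tt = np.trace(M @ M) / 2.0
    loss = np.trace(M) ** 2 / (2.0 * p)    # information lost to the unknown precision
    return I_tt, I_tt - loss

# Bivariate check: C = [[1, theta], [theta, 1]], C_theta = [[0, 1], [1, 0]].
theta = 0.5
C = np.array([[1.0, theta], [theta, 1.0]])
C_theta = np.array([[0.0, 1.0], [1.0, 0.0]])
I_tt, I_tt_psi = info_known_and_equal(C, C_theta)
print(I_tt, (1 + theta**2) / (1 - theta**2) ** 2)
print(I_tt_psi, 1 / (1 - theta**2) ** 2)
```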
Figure 1 plots $I_{\theta\theta}$ and $I_{\theta\theta\cdot\psi}$ for two different copula models in which the symmetry
condition on $BC_\theta$ is satisfied. The first panel of the figure plots information bounds for the
exchangeable correlation model with $p = 4$, in which $\mathrm{Cor}[Y_j, Y_k] = \theta$ for all $j \neq k$. For this
model, straightforward calculations show that

$$
\begin{aligned}
I_{\theta\theta} &= \frac{1}{(1-\theta)^2}\, p\,(p\gamma^2 - 2\gamma + 1)/2, \quad \text{where } \gamma^{-1} = 1 + (p-1)\theta, \\
I_{\theta\theta\cdot\psi} &= \frac{p(p-1)}{2\,((p-1)\theta + 1)^2 (1-\theta)^2},
\end{aligned}
$$

where the second expression of course reduces to $(1 - \theta^2)^{-2}$, the information bound for the bivariate
normal copula model, when $p = 2$. Note that the parameter space is $\Theta = (-1/(p-1), 1)$, and that there is
very little information loss for $\theta < 0$ relative to the loss for $\theta > 0$.
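The closed forms for the exchangeable model can be compared with the general trace expressions. The sketch below is only a numerical sanity check under the definitions stated above; the two function names are hypothetical.

```python
import numpy as np

def exchangeable_info_numeric(theta, p):
    """I_tt and I_tt_psi from the trace formulas, for the exchangeable model."""
    C = (1 - theta) * np.eye(p) + theta * np.ones((p, p))
    C_theta = np.ones((p, p)) - np.eye(p)      # derivative of C with respect to theta
    M = np.linalg.solve(C, C_theta)            # B C_theta
    I_tt = np.trace(M @ M) / 2
    return I_tt, I_tt - np.trace(M) ** 2 / (2 * p)

def exchangeable_info_closed_form(theta, p):
    """The same two quantities from the closed-form expressions."""
    gamma = 1 / (1 + (p - 1) * theta)
    I_tt = p * (p * gamma**2 - 2 * gamma + 1) / (2 * (1 - theta) ** 2)
    I_tt_psi = p * (p - 1) / (2 * ((p - 1) * theta + 1) ** 2 * (1 - theta) ** 2)
    return I_tt, I_tt_psi

# Compare on a few values inside the parameter space (-1/(p-1), 1) for p = 4.
for theta in (-0.2, 0.0, 0.4, 0.8):
    print(theta, exchangeable_info_numeric(theta, 4), exchangeable_info_closed_form(theta, 4))
```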

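The second panel of the figure gives information bounds for a $4 \times 4$ circular correlation
matrix for which the first row is $c_1(\theta) = (1, \theta, \theta^2, \theta)$. For this model, we have

$$
\begin{aligned}
I_{\theta\theta\cdot\psi} &= \frac{4}{(1-\theta^2)^2}, \\
I_{\theta\theta} &= \frac{4\,(1 + 2\theta^2)}{(1-\theta^2)^2}.
\end{aligned}
$$

Note that the information bounds for this model are symmetric about $\theta = 0$.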
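One way to see where these expressions come from is that a symmetric circulant matrix is diagonalized by the discrete Fourier transform, so the eigenvalues of $BC_\theta$ are ratios of the transforms of the first rows of $C_\theta$ and $C$. The following sketch uses this fact; it is an illustration rather than the authors' calculation, and the function name is hypothetical.

```python
import numpy as np

def circular_info(theta):
    """I_tt and I_tt_psi for the 4 x 4 circular model via circulant eigenvalues."""
    c = np.array([1.0, theta, theta**2, theta])       # first row of C
    c_theta = np.array([0.0, 1.0, 2 * theta, 1.0])    # first row of dC/dtheta
    lam = np.fft.fft(c).real        # eigenvalues of C (real, since C is symmetric)
    mu = np.fft.fft(c_theta).real   # eigenvalues of C_theta, same eigenvectors
    ratio = mu / lam                # eigenvalues of B C_theta
    I_tt = np.sum(ratio**2) / 2                    # tr(B C_theta B C_theta) / 2
    I_tt_psi = I_tt - np.sum(ratio) ** 2 / (2 * 4) # subtract tr(B C_theta)^2 / (2p)
    return I_tt, I_tt_psi

theta = 0.3
print(circular_info(theta))
print(4 * (1 + 2 * theta**2) / (1 - theta**2) ** 2, 4 / (1 - theta**2) ** 2)
```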




Figure 1: Asymptotic information bounds for copula models with known and equal but
unknown margins. The left panel gives $I_{\theta\theta}^{-1}$ and $I_{\theta\theta\cdot\psi}^{-1}$ for the $p = 4$ exchangeable copula
model, and the right panel gives bounds for a one parameter circular copula model.
5 LAN for general Gaussian copulas



For a copula model with the symmetry in $BC_{\theta_k}$ described above, the limiting distribution of
the rank likelihood ratio is the same as that of the likelihood ratio for the corresponding normal
model with common marginal variances, i.e. the model is "symmetric" in the variances.
Analogously, one might expect that for a Gaussian copula model lacking this symmetry, the
limiting distribution of $\lambda_r$ might match that of the likelihood ratio for the corresponding
normal model with "asymmetric," i.e. unequal marginal variances. This turns out to be
correct, as we now show.

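Consider the class of mean-zero multivariate normal models with inverse-covariance matrix
$\mathrm{Var}[Y \mid \theta, \boldsymbol{\psi}]^{-1} = D(\boldsymbol{\psi})^{1/2} B(\theta) D(\boldsymbol{\psi})^{1/2}$, where $\theta \in \mathbb{R}^q$ and $D(\boldsymbol{\psi})$ is the diagonal matrix
with diagonal elements $\boldsymbol{\psi} \in \mathbb{R}^p$. The log probability density for a member of this class is
given by

$$
l(y) = \Bigl(-p \log 2\pi + \sum_{j} \log \psi_j + \log |B| - y^T D(\boldsymbol{\psi})^{1/2} B D(\boldsymbol{\psi})^{1/2} y\Bigr)/2.
$$

The log-likelihood derivatives are

$$
\begin{aligned}
l_{\theta_k}(y) &= \bigl[\mathrm{tr}(B_{\theta_k} C) - y^T D(\boldsymbol{\psi})^{1/2} B_{\theta_k} D(\boldsymbol{\psi})^{1/2} y\bigr]/2, \\
l_{\psi_j}(y) &= \bigl[1 - \psi_j^{1/2}\, y_j\, b_j^T D(\boldsymbol{\psi})^{1/2} y\bigr]/(2\psi_j).
\end{aligned}
$$

Straightforward calculations show that

$$
\begin{aligned}
I_{\boldsymbol{\psi}\boldsymbol{\psi}} &= D(\boldsymbol{\psi})^{-1} (I + B \circ C)\, D(\boldsymbol{\psi})^{-1}/4, \\
I_{\boldsymbol{\psi}\theta_k} &= -D(\boldsymbol{\psi})^{-1} \mathrm{diag}(BC_{\theta_k})/2,
\end{aligned}
$$

where "$\circ$" is the Hadamard product denoting element-wise multiplication. The local log likelihood
ratio for this model can be expressed as

$$
\lambda_y(s, t) = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \bigl[s^T l_\theta(Y_i) + t^T l_{\boldsymbol{\psi}}(Y_i)\bigr] - \frac{1}{2} \begin{bmatrix} s \\ t \end{bmatrix}^T I \begin{bmatrix} s \\ t \end{bmatrix} + o_p(1).
$$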
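The derivative with respect to the marginal precisions can be checked against a finite-difference gradient of the log density. The sketch below is a hedged illustration under the parameterization stated above; the function names, the test point and the choice of an autoregressive $C$ are all assumptions made for the example.

```python
import numpy as np

def log_density(y, psi, B):
    """Log density of N(0, [D(psi)^{1/2} B D(psi)^{1/2}]^{-1}) evaluated at y."""
    p = len(y)
    z = np.sqrt(psi) * y                         # D(psi)^{1/2} y
    _, logdet_B = np.linalg.slogdet(B)
    return (-p * np.log(2 * np.pi) + np.sum(np.log(psi)) + logdet_B - z @ B @ z) / 2

def score_psi(y, psi, B):
    """Closed-form derivative: [1 - psi_j^{1/2} y_j b_j^T D(psi)^{1/2} y] / (2 psi_j)."""
    z = np.sqrt(psi) * y
    return (1 - np.sqrt(psi) * y * (B @ z)) / (2 * psi)

def numeric_score_psi(y, psi, B, eps=1e-6):
    """Central finite-difference gradient of the log density in psi."""
    grad = np.zeros_like(psi)
    for j in range(len(psi)):
        step = np.zeros_like(psi)
        step[j] = eps
        grad[j] = (log_density(y, psi + step, B) - log_density(y, psi - step, B)) / (2 * eps)
    return grad

# Example: p = 3 autoregressive correlation with theta = 0.5, arbitrary y and psi.
idx = np.arange(3)
B = np.linalg.inv(0.5 ** np.abs(idx[:, None] - idx[None, :]))
y = np.array([0.3, -1.2, 0.8])
psi = np.array([0.5, 1.0, 2.0])
print(score_psi(y, psi, B))
print(numeric_score_psi(y, psi, B))
```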

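As before, we take our rank based approximation $\lambda_{\hat{y}}$ to be equal to $\lambda_y$ absent the $o_p(1)$
term and with each $Y_i$ replaced by its approximate normal scores $\hat{Y}_i$. Clearly, we have

$$
\lambda_y - \lambda_{\hat{y}} = o_p(1) \iff
\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Bigl\{ \bigl[s^T l_\theta(Y_i) + t^T l_{\boldsymbol{\psi}}(Y_i)\bigr] - \bigl[s^T l_\theta(\hat{Y}_i) + t^T l_{\boldsymbol{\psi}}(\hat{Y}_i)\bigr] \Bigr\} = o_p(1).
$$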
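For concreteness, the following sketch computes one common version of approximate normal scores, $\Phi^{-1}(R_{i,j}/(n+1))$, where $R_{i,j}$ is the rank of observation $i$ within variable $j$. This particular rescaling is an assumption made for illustration; the definition used in the earlier sections of this article may differ in detail. The function name is hypothetical.

```python
import numpy as np
from statistics import NormalDist

def approximate_normal_scores(Y):
    """Map each column of Y to standard normal quantiles of its rescaled ranks.

    Assumes continuous margins, so ties are not handled.
    """
    n, _ = Y.shape
    # Double argsort gives 0-based ranks within each column; add 1 for ranks 1..n.
    ranks = np.argsort(np.argsort(Y, axis=0), axis=0) + 1
    inv_cdf = np.vectorize(NormalDist().inv_cdf)
    return inv_cdf(ranks / (n + 1))

# The scores depend on the data only through the ranks, so they are unchanged
# by strictly increasing transformations of each margin.
rng = np.random.default_rng(0)
Y = rng.normal(size=(50, 3))
T = np.column_stack([np.exp(Y[:, 0]), Y[:, 1] ** 3, 2 * Y[:, 2] + 1])
print(np.allclose(approximate_normal_scores(Y), approximate_normal_scores(T)))
```

Because the scores are functions of the ranks alone, any statistic built from them, such as $\lambda_{\hat{y}}$, is rank measurable.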
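Given $\theta$ and $s$, we now identify a value of $t$ for which the above asymptotic result holds.
Let $t = Hs$, where $H \in \mathbb{R}^{p \times q}$, so that

$$
s^T l_\theta(y) + t^T l_{\boldsymbol{\psi}}(y) = s^T\bigl[l_\theta(y) + H^T l_{\boldsymbol{\psi}}(y)\bigr]
 = \sum_{k=1}^{q} s_k \bigl[l_{\theta_k}(y) + h_k^T l_{\boldsymbol{\psi}}(y)\bigr],
$$

where $\{h_k, k = 1, \ldots, q\}$ are the columns of $H$. Now $l_{\theta_k}(y)$ and $l_{\boldsymbol{\psi}}(y)$ are both quadratic in $y$.
Evaluating at $\boldsymbol{\psi} = \mathbf{1}$, we have $l_{\theta_k}(y) = [\mathrm{tr}(B_{\theta_k} C) - y^T B_{\theta_k} y]/2$ and $l_{\psi_j}(y) = [1 - y_j b_j^T y]/2$,
and so

$$
h_k^T l_{\boldsymbol{\psi}}(y) = \bigl[h_k^T \mathbf{1} - y^T D(h_k) B y\bigr]/2,
$$

where $D(h_k)$ is the diagonal matrix with elements $h_k$. Therefore, we can write
$s^T[l_\theta(y) + H^T l_{\boldsymbol{\psi}}(y)]$ as

$$
s^T\bigl[l_\theta(y) + H^T l_{\boldsymbol{\psi}}(y)\bigr] = \sum_{k=1}^{q} s_k \bigl[l_{\theta_k}(y) + h_k^T l_{\boldsymbol{\psi}}(y)\bigr]
 = \Bigl(\sum_{k=1}^{q} s_k\, y^T A_k y\Bigr) + c(s, H, \theta),
$$

where $c(s, H, \theta)$ does not depend on $y$, and $A_k$ is given by

$$
A_k = -\bigl[B_{\theta_k} + D(h_k) B\bigr]/2.
$$

Note that if $h_k = h_k \mathbf{1}$ for a scalar $h_k$, i.e. all the values are common, then the value of $A_k$ reduces to the
matrix $A_k$ in Section 4.2, in the case of equal marginal variances.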

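Substituting this representation of $s^T l_\theta + t^T l_{\boldsymbol{\psi}}$ into $\lambda_y$ and $\lambda_{\hat{y}}$ gives

$$
\lambda_y - \lambda_{\hat{y}} = \sum_{k=1}^{q} s_k \Bigl( \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \bigl[ Y_i^T A_k Y_i - \hat{Y}_i^T A_k \hat{Y}_i \bigr] \Bigr) + o_p(1).
$$

Theorem 3.2 implies that this difference will converge in probability to zero if the diagonal
elements of $(A_k + A_k^T)C$ are zero for each $k = 1, \ldots, q$. The value of $(A_k + A_k^T)C$ can be
calculated as

$$
\begin{aligned}
2(A_k + A_k^T)C &= -2 B_{\theta_k} C - D(h_k) BC - B D(h_k) C \\
 &= 2 BC_{\theta_k} - \bigl(D(h_k) + B D(h_k) C\bigr),
\end{aligned}
$$

where the second line uses $BC = I$ and $B_{\theta_k} = -BC_{\theta_k}B$.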
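The vector $\mathrm{diag}(D(h_k) + B D(h_k) C)$ can be written as

$$
\mathrm{diag}\bigl(D(h_k) + B D(h_k) C\bigr) =
\begin{pmatrix} h_{k1} + h_k^T (b_1 \circ c_1) \\ \vdots \\ h_{kp} + h_k^T (b_p \circ c_p) \end{pmatrix}
 = (I + B \circ C)\, h_k,
$$

and so our condition on $h_k$ becomes

$$
\begin{aligned}
(I + B \circ C)\, h_k &= 2\, \mathrm{diag}(BC_{\theta_k}), \\
h_k &= 2\, (I + B \circ C)^{-1} \mathrm{diag}(BC_{\theta_k}).
\end{aligned}
$$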
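The choice of $h_k$ can be verified numerically for a model without the symmetry of Section 4. The sketch below is an illustration only: it uses a one-parameter first-order autoregressive correlation matrix with entries $\theta^{|j-k|}$, solves for $h$, and confirms that the diagonal of $(A + A^T)C$ vanishes. All function names are hypothetical.

```python
import numpy as np

def ar1(theta, p):
    """Autoregressive correlation matrix with entries theta^|j-k|."""
    idx = np.arange(p)
    return theta ** np.abs(idx[:, None] - idx[None, :])

def ar1_deriv(theta, p):
    """Entrywise derivative of ar1 with respect to theta: k * theta^(k-1)."""
    idx = np.arange(p)
    k = np.abs(idx[:, None] - idx[None, :])
    return k * theta ** np.maximum(k - 1, 0)

def solve_h(C, C_theta):
    """h = 2 (I + B o C)^{-1} diag(B C_theta), with o the Hadamard product."""
    B = np.linalg.inv(C)
    d = np.diag(B @ C_theta)
    return 2 * np.linalg.solve(np.eye(C.shape[0]) + B * C, d)

def symmetrized_diag(C, C_theta, h):
    """Diagonal of (A + A^T) C for A = -[B_theta + D(h) B] / 2."""
    B = np.linalg.inv(C)
    B_theta = -B @ C_theta @ B           # derivative of the inverse
    A = -(B_theta + np.diag(h) @ B) / 2
    return np.diag((A + A.T) @ C)

C, C_theta = ar1(0.5, 4), ar1_deriv(0.5, 4)
h = solve_h(C, C_theta)
print(np.round(h, 4))                                  # not constant across margins
print(np.round(symmetrized_diag(C, C_theta, h), 12))   # numerically zero
```

With $h$ set to zero instead, the same diagonal is non-zero, which is why the unequal-variance direction $t = Hs$ is needed for this model.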



This result allows us to find the asymptotic distribution of the local rank likelihood ratio
$\lambda_r(s)$ for any smoothly parameterized normal copula model with continuous margins. Given
such a copula model, we can form the local log likelihood ratio of the corresponding multivariate
normal model with unequal variances, $\lambda_y(s, t)$. If we set $t = -I_{\boldsymbol{\psi}\boldsymbol{\psi}}^{-1} I_{\boldsymbol{\psi}\theta}\, s$ then by
Theorem 3.2 we have $\lambda_y(s, t) - \lambda_{\hat{y}}(s, t) = o_p(1)$ under i.i.d. sampling from $N(0, C(\theta))$. By
Theorem 2.4, the limiting distribution of $\lambda_r(s)$ under i.i.d. sampling from the corresponding
copula is therefore $N(-s^T I_{\theta\theta\cdot\boldsymbol{\psi}} s/2,\ s^T I_{\theta\theta\cdot\boldsymbol{\psi}} s)$, where $I_{\theta\theta\cdot\boldsymbol{\psi}} = I_{\theta\theta} - I_{\theta\boldsymbol{\psi}} I_{\boldsymbol{\psi}\boldsymbol{\psi}}^{-1} I_{\boldsymbol{\psi}\theta}$ is the
information for $\theta$ in the parametric normal model.

Theorem 5.1. Let $\{C(\theta) : \theta \in \Theta \subset \mathbb{R}^q\}$ be a collection of correlation matrices such that
$C(\theta)$ is twice differentiable. If $Y_1, \ldots, Y_n$ are i.i.d. from a population with continuous
marginal distributions and copula $C(\theta)$ for some $\theta \in \Theta$, then the distribution of the rank
likelihood ratio $\lambda_r(s)$ converges to a $N(-s^T I_{\theta\theta\cdot\boldsymbol{\psi}} s/2,\ s^T I_{\theta\theta\cdot\boldsymbol{\psi}} s)$ distribution, where $I_{\theta\theta\cdot\boldsymbol{\psi}}$ is
the information for $\theta$ in the normal model with correlation $C(\theta)$ and marginal precisions $\boldsymbol{\psi}$.


This theorem generalizes the result of Theorem 4.1 regarding models for which the diagonal
elements of $BC_{\theta_k}$ are equal for each $k$. Under that condition, the information matrix
$I_{\theta\theta\cdot\psi}$ for $\theta$ when the marginal variances are equal is the same as $I_{\theta\theta\cdot\boldsymbol{\psi}}$, the information for $\theta$
when the variances are unequal.



5.1 Example: First order autoregressive copula 

The first-order autoregressive correlation model can be parameterized as $C(\theta) = \{c_{j,k}(\theta) = \theta^{|j-k|}\}$.
This simple one-parameter model does not satisfy the conditions of Theorem 4.1, and
so the limiting distribution of the rank likelihood in this case is equal to that of the likelihood
for the normal model with unequal marginal variances. For illustration, we compute for this
model $I_{\theta\theta}$, $I_{\theta\theta\cdot\psi}$ and $I_{\theta\theta\cdot\boldsymbol{\psi}}$, the information functions for $\theta$ under known, unknown but equal,
and unknown and unequal marginal variances.

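For a generic one-parameter Gaussian copula, let $I_{\theta\boldsymbol{\psi}}$ be the off-diagonal block of the
information matrix in the unequal variance model, and let $I_{\theta\psi}$ be the corresponding element
of the information matrix in the equal variance model. Under $\boldsymbol{\psi} = \mathbf{1}$ and $\psi = 1$ respectively,
we have $I_{\theta\boldsymbol{\psi}} = -\mathrm{diag}(BC_\theta)/2$ and $I_{\theta\psi} = -\mathrm{tr}(BC_\theta)/2 = \mathbf{1}^T I_{\theta\boldsymbol{\psi}}$. Letting $d = \mathrm{diag}(BC_\theta)$, the
information in $\theta$ under these two models can then be written as

$$
\begin{aligned}
I_{\theta\theta\cdot\psi} &= I_{\theta\theta} - d^T \bigl[\mathbf{1}\mathbf{1}^T/(2p)\bigr] d, \\
I_{\theta\theta\cdot\boldsymbol{\psi}} &= I_{\theta\theta} - d^T \bigl[(B \circ C + I)^{-1}\bigr] d.
\end{aligned}
$$

We have $I_{\theta\theta\cdot\psi} \geq I_{\theta\theta\cdot\boldsymbol{\psi}}$ with equality if the conditions of Theorem 4.1 are met, i.e. the diagonal
elements of $BC_\theta$ are all equal.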
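These three information functions are simple to evaluate for the autoregressive model. The following is a minimal sketch under the formulas above, with a hypothetical function name; it is meant as a numerical companion to the discussion, not a reproduction of the figure.

```python
import numpy as np

def ar1_informations(theta, p):
    """Return (I_known, I_equal, I_unequal) for the AR(1) Gaussian copula.

    I_known   : margins known,                 tr(B C_theta B C_theta) / 2
    I_equal   : common unknown precision,      I_known - d^T [1 1^T / (2p)] d
    I_unequal : separate unknown precisions,   I_known - d^T (B o C + I)^{-1} d
    where d = diag(B C_theta) and o is the Hadamard product.
    """
    idx = np.arange(p)
    k = np.abs(idx[:, None] - idx[None, :])
    C = theta ** k
    C_theta = k * theta ** np.maximum(k - 1, 0)   # entrywise derivative of C
    B = np.linalg.inv(C)
    M = B @ C_theta
    d = np.diag(M)
    I_known = np.trace(M @ M) / 2
    I_equal = I_known - d.sum() ** 2 / (2 * p)
    I_unequal = I_known - d @ np.linalg.solve(B * C + np.eye(p), d)
    return I_known, I_equal, I_unequal

for theta in (-0.5, 0.0, 0.5):
    print(theta, np.round(ar1_informations(theta, 4), 4))
```

At $\theta = 0$ the vector $d$ is zero, so all three quantities coincide; away from zero the ordering $I_{\theta\theta} \geq I_{\theta\theta\cdot\psi} \geq I_{\theta\theta\cdot\boldsymbol{\psi}}$ can be read off the printed values.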

The first panel of Figure 2 plots $I_{\theta\theta}$, $I_{\theta\theta\cdot\psi}$ and $I_{\theta\theta\cdot\boldsymbol{\psi}}$ for the first-order autoregressive model
with $p = 4$. Note that the information loss in going from equal to unequal variances is quite
small, as compared to going from known to unknown marginal variances. The second panel
of the figure highlights these differences.




Figure 2: Asymptotic information bounds for the first-order autoregressive normal copula
model with known, equal but unknown margins, and unequal and unknown margins. The
first panel gives the bounds for the three cases, and the second panel gives the differences
$I_{\theta\theta\cdot\psi}^{-1} - I_{\theta\theta}^{-1}$ and $I_{\theta\theta\cdot\boldsymbol{\psi}}^{-1} - I_{\theta\theta\cdot\psi}^{-1}$, the latter one being the smaller of the two.
6 Discussion



The partial sufficiency of the multivariate ranks in semiparametric copula models suggests
the existence of asymptotically efficient rank-based estimators of copula parameters. For the
one-parameter bivariate Gaussian copula model, the rank-based pseudo-likelihood estimator
of Genest et al. [1995] is asymptotically equivalent to the normal scores correlation coefficient,
which Klaassen and Wellner [1997] showed to be asymptotically efficient. For other copula
models, the existence of efficient rank-based estimators is an open question. Genest and
Werker [2002] showed with a non-Gaussian example that the pseudo-likelihood estimator is
not generally asymptotically efficient. However, this does not rule out the possibility that
other rank-based estimators are asymptotically efficient, or that pseudo-likelihood estimators
would be efficient for general Gaussian copula models.

A natural candidate for an efficient rank-based estimator is the maximizer of the rank 
likelihood. However, whereas the pseudo-likelihood is a very explicit function of the copula 
density (making optimization and asymptotic analysis tractable), the rank likelihood involves 
a multivariate integral over a set of order constraints, the number of which grows with the 
sample size. While rank likelihood copula estimation can be made feasible with standard
Markov chain Monte Carlo integration techniques [Hoff, 2007], an asymptotic analysis of the
rank likelihood seems to require techniques tailored to this particular problem.

In this article, we have shown that the existence of a sufficiently accurate rank measurable
approximation to a parametric submodel implies the local asymptotic normality of the rank
likelihood. We have also shown that such approximations exist for every smoothly parameterized
Gaussian copula model. For such a copula model, the asymptotic information bound
implied by the rank likelihood matches that of the corresponding parametric multivariate
normal submodel. This result suggests the possibility of asymptotically efficient rank-based
estimators for Gaussian copula models: Generally speaking, the information $I_r$ based on
the ranks is less than or equal to the semiparametric information $I_f$ based on the full
data, as the ranks are functions of the full data [Le Cam and Yang, 1988]. Furthermore, the
semiparametric information based on the full data is less than or equal to $I_p$, the infimum
of information functions over all parametric submodels, and so $I_r \leq I_f \leq I_p$ in general. On
the other hand, for Gaussian copula models we have shown that $I_r$ is equal to the information
for a particular parametric submodel, the corresponding multivariate normal model.
This implies that for a given Gaussian copula model, the corresponding multivariate normal
model is least favorable, that $I_f = I_p$ and therefore $I_r = I_f$. Based on this result,
we conjecture that maximum likelihood estimators based on rank likelihoods are asymptotically
efficient for Gaussian copula models, and possibly more generally whenever information
bounds based on the complete data for the semiparametric model in question exist.